A New Feature Selection Technique Combined with ELM Feature Space for Text Classification
نویسندگان
چکیده
The aim of text classification is to classify the text documents into a set of pre-defined categories. But the complexity of natural languages, high dimensional feature space and low quality of feature selection become the main problem for text classification process. Hence, in order strengthen the classification technique, selection of important features, and consequently removing the unimportant ones is the need of the day. The Paper proposes an approach called Commonality-Rarity Score Computation (CRSC) for selecting top features of a corpus and highlights the importance of ML-ELM feature space in the domain of text classification. Experimental results on two benchmark datasets signify the prominence of the proposed approach compared to other established approaches.
منابع مشابه
An Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification
In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...
متن کاملA New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier
With the fast increase of the documents, using Text Document Classification (TDC) methods has become a crucial matter. This paper presented a hybrid model of Invasive Weed Optimization (IWO) and Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS) in order to reduce the big size of features space in TDC. TDC includes different actions such as text processing, feature extraction, form...
متن کاملAn Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification
In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...
متن کاملAn Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...
متن کاملImproving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA
With the explosive growth in amount of information, it is highly required to utilize tools and methods in order to search, filter and manage resources. One of the major problems in text classification relates to the high dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of features space. There are many feature selection methods. However...
متن کامل